01/08/2019

Motivation

Background

“Textbook” sample size calculation for a normal endpoint:

\[ \begin{aligned} \min_{n \in \mathbf{N}} ~ & n \\ \text{subject to } ~ & g(n, \mu, {\color{red}{\sigma}}) \geq {\color{blue}{1 - \beta^*}} \end{aligned} \]

\(g(n, \mu, {\color{red}{\sigma}})\) - power of the trial.

\({\color{red}{\sigma}}\) - an unknown nuisance parameter.

\({\color{blue}{1 - \beta^*}}\) - a power threshold.
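A minimal sketch of this constrained minimisation, assuming a two-arm trial with \(n\) per arm, a two-sided z-test and a 5% significance level (none of which is stated on the slide; the function names are illustrative). It uses the values from the Incoherence example as inputs.

```python
from math import sqrt
from statistics import NormalDist

Z = NormalDist()  # standard normal distribution

def power(n, mu, sigma, alpha=0.05):
    """Approximate power g(n, mu, sigma) of a two-sided z-test, n per arm (assumed setup)."""
    se = sigma * sqrt(2 / n)            # standard error of the difference in means
    z_crit = Z.inv_cdf(1 - alpha / 2)   # critical value of the test
    return Z.cdf(mu / se - z_crit)      # ignores the negligible opposite tail

def min_n(mu, sigma, target=0.8, alpha=0.05):
    """Smallest n per arm whose power reaches the target threshold."""
    n = 2
    while power(n, mu, sigma, alpha) < target:
        n += 1
    return n

print(min_n(mu=0.3, sigma=1.0))
print(min_n(mu=0.3, sigma=1.3))
```

Under these assumptions the search gives 175 for \(\sigma = 1\) and 295 for \(\sigma = 1.3\). The latter is one short of the 296 quoted in the slides, which presumably come from a t-test calculation.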

Incoherence

Minimal Clinically Important Difference: \(\mu = 0.3\)

Power threshold: \(1 - \beta^* = 0.8\)

Universe A: \(\hat{\sigma} = 1 \rightarrow n = 175\)

Universe B: \(\hat{\sigma} = 1.3 \rightarrow n = 296\)

Same effect to be detected, same power, but different sample size.

More generally, as the nuisance parameter varies, so does the amount we are willing to invest in a study to get 80% power to detect \(\mu = 0.3\).
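The size of the jump can be read off the usual normal-approximation formula, assuming a two-arm comparison with \(n\) per arm and a two-sided significance level \(\alpha\) (not stated on the slide):

\[ n \approx \frac{2 \left( z_{1-\alpha/2} + z_{1-\beta^*} \right)^2 \sigma^2}{\mu^2} \quad \Rightarrow \quad \frac{n_B}{n_A} \approx \left( \frac{1.3}{1} \right)^2 = 1.69 \approx \frac{296}{175} \]

With \(\mu\) and \(1 - \beta^*\) held fixed, the required sample size scales with \(\sigma^2\).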

Sample Size Samba


Sample size re-estimation

  • Suppose we have an initial estimate \(\hat{\sigma} = 1\)
  • Then we choose \(n = 175\) for 80% power to detect \(\mu = 0.3\)
  • But we then get an interim estimate of \(\hat{\sigma} = 1.3 \rightarrow\) inflate sample size to \(n = 296\)…

No flexibility to samba - already declared \(\mu = 0.3\).

\(\rightarrow\) Sample size re-estimation is incoherent.

Methods

Proposal

We should:

  • design trials by considering both costs (sample size) and benefit (power);
  • do so in an explicit, transparent way;
  • use the same methodology for an initial sample size calculation as for a re-estimation.

This would:

  • make sample size re-estimation a coherent procedure, and so give us a useful tool for dealing with nuisance parameter uncertainty;
  • possibly eliminate the need for sample size re-estimation (SSR) altogether.

Value function

Choose \(n\) to maximise value, denoted \(v(n, \sigma)\), a weighted sum of power and sample size:

\[ \max_{n} v(n, \sigma) = g(n, \sigma) - \lambda n \]

Implicit assumptions about value:

  • Linear in sample size;
  • Linear in power;
  • Sample size and power are preferentially independent - i.e. our preferences about sample size are independent of power, and vice versa.
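The maximisation above can be sketched by brute force. This is an illustration only: it assumes a two-arm trial with \(n\) per arm, a two-sided 5% z-test and a difference to detect of 0.3, and the function names are hypothetical.

```python
from math import sqrt
from statistics import NormalDist

Z = NormalDist()  # standard normal distribution

def power(n, sigma, mu=0.3, alpha=0.05):
    """Approximate power g(n, sigma) for a two-sided z-test, n per arm (assumed setup)."""
    se = sigma * sqrt(2 / n)
    return Z.cdf(mu / se - Z.inv_cdf(1 - alpha / 2))

def value(n, sigma, lam):
    """v(n, sigma) = g(n, sigma) - lambda * n."""
    return power(n, sigma) - lam * n

def best_n(sigma, lam, n_max=2000):
    """Value-maximising n, found by evaluating every n up to n_max."""
    return max(range(2, n_max + 1), key=lambda n: value(n, sigma, lam))

n_star = best_n(sigma=1.0, lam=0.0022)
print(n_star, round(power(n_star, 1.0), 3))
```

With \(\lambda = 0.0022\) and \(\sigma = 1\) the optimum should land close to \(n = 175\); the exact integer depends on the rounding of \(\lambda\) and on the test assumed. A larger \(\lambda\) (more costly participants) pulls the optimum down.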

Illustration

Two-arm parallel-group trial comparing group means of a normally distributed outcome.

Difference to detect: 0.3

Best guess of standard deviation: 1

Trade-off parameter \(\lambda\): 0.0022

(\(\rightarrow n = 175\) under both frameworks)

Fixed designs

\(\rightarrow\) A value-based approach will lead to a coherent framework for sample size re-estimation, with less variability in \(n\) but more variability in power.

But, we can go further - in some cases, we don’t need to do re-estimation at all.
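The trade of variability in \(n\) for variability in power can be checked numerically. The sketch below is not the author's code: it assumes a two-sided 5% z-test with \(n\) per arm, and compares the threshold rule with the value rule for the two estimates of \(\sigma\) used earlier.

```python
from math import sqrt
from statistics import NormalDist

Z = NormalDist()
MU, LAM, TARGET = 0.3, 0.0022, 0.8  # values taken from the slides

def power(n, sigma, alpha=0.05):
    """Approximate power of a two-sided z-test, n per arm (assumed setup)."""
    return Z.cdf(MU / (sigma * sqrt(2 / n)) - Z.inv_cdf(1 - alpha / 2))

def threshold_n(sigma):
    """Smallest n reaching the power threshold."""
    n = 2
    while power(n, sigma) < TARGET:
        n += 1
    return n

def value_n(sigma, n_max=2000):
    """n maximising power minus LAM * n."""
    return max(range(2, n_max + 1), key=lambda n: power(n, sigma) - LAM * n)

for sigma in (1.0, 1.3):
    n_t, n_v = threshold_n(sigma), value_n(sigma)
    print(f"sigma={sigma}: threshold n={n_t} (power {power(n_t, sigma):.2f}), "
          f"value n={n_v} (power {power(n_v, sigma):.2f})")
```

Under the threshold rule the power stays at 80% while \(n\) moves a lot; under the value rule \(n\) moves much less and the achieved power absorbs the change in \(\sigma\).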

Example

Example: cluster RCT

\[ n = n_i(\sigma_t) [1 + (m-1)\rho] \]

\(n_i(\sigma_t)\) = sample size for an individually randomised trial with no clustering and the same total variance, \(\sigma_t^2\).

\(\rho =\) intracluster correlation coefficient (ICC) - proportion of the total variance due to variability between clusters.

\(m =\) number of participants per cluster

\(k =\) number of clusters
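A small sketch of the design-effect inflation. The input values are hypothetical, and taking the number of clusters as \(n / m\) rounded up is an assumption not stated on the slide.

```python
from math import ceil

def design_effect(m, rho):
    """Inflation factor 1 + (m - 1) * rho for clusters of size m and ICC rho."""
    return 1 + (m - 1) * rho

def cluster_sample_size(n_individual, m, rho):
    """Inflate an individually randomised sample size; return (n, k)."""
    n = n_individual * design_effect(m, rho)
    k = ceil(n / m)  # assumed: clusters needed to recruit n participants
    return n, k

# Hypothetical inputs: n_i = 175, 20 participants per cluster, ICC of 0.05
n, k = cluster_sample_size(n_individual=175, m=20, rho=0.05)
print(round(n, 2), k)
```

Because \(\rho\) enters the formula linearly, uncertainty in the ICC translates directly into uncertainty about the required \(n\), which is what makes it a nuisance parameter here.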

Number of clusters: 15; number of participants: 470

Number of clusters: 18; number of participants: 500

Discussion

Summary

  • Sample size calculations can be incoherent when nuisance parameters are unknown.
  • Value-based trial design can lead to coherent decisions and facilitate sample size re-estimation.
  • In some cases, re-estimation can be avoided altogether.

Further work

  • How to choose optimal fixed designs?
  • Do our value function assumptions hold?
  • Would precision be a better measure of value than power?
  • Are ‘underpowered’ trials unethical?

Thank you

@DTWilson, D.T.Wilson@leeds.ac.uk, https://github.com/DTWilson/Robust_value